8 research outputs found

    Transportation in Social Media: an automatic classifier for travel-related tweets

    In recent years, researchers in the field of intelligent transportation systems have made several efforts to extract valuable information from social media streams. However, collecting domain-specific data from any social media platform is a challenging task that demands appropriate and robust classification methods. In this work we explore geo-located tweets in order to create a travel-related tweet classifier using a combination of bag-of-words and word embeddings. The resulting classification makes it possible to identify interesting spatio-temporal relations in São Paulo and Rio de Janeiro.
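
    The abstract above does not say how the bag-of-words and word-embedding representations were combined. As a minimal sketch, assuming simple concatenation of a count vector with an averaged embedding, the feature construction might look like the following. The function names and the two-dimensional `TOY_EMBEDDINGS` table are hypothetical and only illustrate the idea; they are not taken from the paper.

```python
from collections import Counter

# Hypothetical 2-dimensional embeddings, for illustration only.
TOY_EMBEDDINGS = {
    "bus": [1.0, 0.0],
    "train": [0.8, 0.2],
    "pizza": [0.0, 1.0],
}


def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would do more)."""
    return text.lower().split()


def build_vocab(texts):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    vector = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vector[vocab[tok]] = float(count)
    return vector


def mean_embedding(text, embeddings, dim):
    """Average the embeddings of the tokens that have one; zeros if none do."""
    vectors = [embeddings[tok] for tok in tokenize(text) if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combined_features(text, vocab, embeddings, dim):
    """Assumed combination: concatenate the count vector and the mean embedding."""
    return bow_vector(text, vocab) + mean_embedding(text, embeddings, dim)


texts = ["stuck on the bus again", "pizza night"]
vocab = build_vocab(texts)
features = combined_features(texts[0], vocab, TOY_EMBEDDINGS, dim=2)
print(features)  # 7 count columns followed by 2 embedding columns
```

    The resulting vectors could then be passed to any standard classifier; which classifier the authors used is not stated in the abstract.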

    Characterizing Geo-located Tweets in Brazilian Megacities

    This work presents a framework for collecting, processing and mining geo-located tweets in order to extract meaningful and actionable knowledge in the context of smart cities. We collected and characterized more than 9M tweets from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We performed topic modeling using the Latent Dirichlet Allocation model to produce an unsupervised distribution of semantic topics over the stream of geo-located tweets as well as a distribution of words over those topics. We manually labeled and aggregated similar topics, obtaining a total of 29 different topics across both cities. Results showed similarities in the majority of topics for both cities, reflecting similar interests and concerns among the population of Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more prominent in one of the cities.
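
    The abstract above names Latent Dirichlet Allocation but not the inference method or library used. As an illustrative sketch only, assuming collapsed Gibbs sampling and hypothetical hyperparameters (`alpha`, `beta`, `iterations`), the two outputs it mentions, a distribution of topics per document and a distribution of words per topic, can be estimated as follows.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Estimate LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (vocab, theta, phi): theta[d][k] is the share of topic k in
    document d, and phi[k][v] is the probability of word vocab[v] in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # counts per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(count + beta) / (topic_total[k] + vocab_size * beta) for count in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


# Toy, made-up "tweets" already split into tokens.
docs = [
    ["traffic", "bus", "late", "traffic"],
    ["beach", "sun", "beach", "football"],
    ["bus", "metro", "late"],
]
vocab, theta, phi = lda_gibbs(docs, num_topics=2, iterations=50)
```

    Each row of `theta` and `phi` sums to one. Labeling and merging the resulting topics, as the authors describe, remains a manual step.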

    A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese

    Textual health records of cancer patients are usually protracted and highly unstructured, making it very time-consuming for health professionals to get a complete overview of the patient's therapeutic course. As such limitations can lead to suboptimal and/or inefficient treatment procedures, healthcare providers would greatly benefit from a system that effectively summarizes the information of those records. With the advent of deep neural models, this objective has been partially attained for English clinical texts; however, the research community still lacks an effective solution for languages with limited resources. In this paper, we present the approach we developed to extract procedures, drugs, and diseases from oncology health records written in European Portuguese. This project was conducted in collaboration with the Portuguese Institute for Oncology which, besides holding over 10 years of duly protected medical records, also provided oncologist expertise throughout the development of the project. Since there is no annotated corpus for biomedical entity extraction in Portuguese, we also present the strategy we followed in annotating the corpus for the development of the models. The final models, which combined a neural architecture with entity linking, achieved F1 scores of 88.6, 95.0, and 55.8 per cent in the mention extraction of procedures, drugs, and diseases, respectively.
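
    For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall. The sketch below assumes exact matching of `(start, end, label)` mention spans; the abstract does not state which matching criterion the authors used, and the example spans are invented.

```python
def f1_score(gold, predicted):
    """F1 over mention spans, counting only exact (start, end, label) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented spans: two of the three predictions match the gold annotations.
gold = {(0, 9, "drug"), (15, 27, "disease"), (30, 41, "procedure")}
predicted = {(0, 9, "drug"), (15, 27, "disease"), (50, 55, "drug")}
print(round(f1_score(gold, predicted), 3))  # 0.667
```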

    Report on the Second International Workshop on Narrative Extraction from Texts (Text2Story 2019)

    The Second International Workshop on Narrative Extraction from Texts (Text2Story’19 [http://text2story19.inesctec.pt/]) was held on the 14th of April 2019, in conjunction with the 41st European Conference on Information Retrieval (ECIR 2019) in Cologne, Germany. The workshop provided a platform for researchers in IR, NLP, and design and visualization to come together and share the recent advances in extraction and formal representation of narratives. The workshop consisted of two invited talks, ten research paper presentations, and a poster and demo session. The proceedings of the workshop are available online at http://ceur-ws.org/Vol-2342/.

    ECIR 2018: Text2Story Workshop-Narrative Extraction from Texts

    The 1st International Workshop on Narrative Extraction from Texts (Text2Story 2018) was held in conjunction with the 40th European Conference on Information Retrieval (ECIR 2018) in Grenoble on 26 March 2018. The workshop aimed to foster collaboration among researchers on a wide range of multidisciplinary issues related to the text-to-narrative structure. The program consisted of two keynote talks, six research presentations, a poster session and a slot for demo presentations. This report briefly summarizes the workshop.

    Algorithmic Science News: support platform for science journalism

    The Algorithmic Science News (ASN) platform is a new tool created by a multidisciplinary team, born of the need to reassert the role of science news in today's media space. The platform aims not only to increase the number of science news stories available to editors, but also to reduce the effort spent on time-consuming tasks such as collecting data and analyzing scientific articles, thereby facilitating the entire news production process. The ASN platform brings together a set of features designed to support a journalist's usual work in newsroom contexts, allowing the use of documents from open-access scientific repositories. The algorithms that operate on these repositories were developed to facilitate access to and exploration of these collections and to allow media outlets to use them as an information source. In this project, we developed tools for reading and interpretation, writing, and semantic suggestion. For reading and interpretation, ASN helps find specialists related to a scientific article, summarizes its most important parts (namely the introduction, objectives, methodologies and results), presents definitions of technical terms, and suggests projects related to the topic. For writing, the platform lets users write notes linked to parts of the article and access sentence suggestions.
    Using a beta version of this platform in newsroom contexts will show to what extent automating tasks associated with journalistic production can help media outlets in transition to digital, and whether it can make science journalism tasks more efficient, enabling science news to be produced at greater scale and quality in the current media landscape.